Boosting for multiclass semi-supervised learning

Authors

  • Jafar Tanha
  • Maarten van Someren
  • Hamideh Afsarmanesh
Abstract

Supervised learning methods are effective when there are sufficient labeled instances. In many applications, however, such as object detection and document and web-page categorization, labeled instances are difficult, expensive, or time-consuming to obtain because they require empirical research or experienced human annotators. Semi-supervised learning algorithms use not only the labeled data but also the unlabeled data to build a classifier. The goal of semi-supervised learning is to combine the information in the unlabeled examples with the explicit classification information of the labeled examples in order to improve classification performance. Most semi-supervised learning algorithms were designed for binary classification problems. However, many practical domains, for example recognition of speech, objects, and characters, involve more than two classes. A multiclass classification problem can be decomposed into a number of independent binary classification problems by methods such as one-versus-all. However, these schemes have their own problems. One-versus-all results in imbalanced class distributions. Moreover, since each classifier is trained independently, the weights of their outputs may be on different scales, so combining them is non-trivial. There is thus a need for a direct multiclass algorithm for semi-supervised learning. In this paper we propose Multiclass SemiBoost, a new algorithm for multiclass semi-supervised learning that follows the boosting approach. It is a direct generalization to the multiclass setting of the binary SemiBoost algorithm [3], which uses both the similarity between the points and the classifier predictions to sample and assign “pseudo-labels” to the unlabeled examples. The key advantage of Multiclass SemiBoost is that it exploits both the manifold and the cluster assumptions to train the classifiers using boosting.
We derive the algorithm from an objective function that combines empirical loss on the labeled data and inconsistency of
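The abstract's remark that one-versus-all produces imbalanced distributions can be illustrated with a small sketch. This is not code from the paper; the toy label set and the helper names `one_versus_all_labels` and `positive_fraction` are assumptions made only for illustration. With K equally sized classes, each binary subproblem sees only 1/K of the examples as positive.

```python
def one_versus_all_labels(labels, positive_class):
    """Relabel a multiclass problem as binary: +1 for positive_class, -1 for the rest."""
    return [1 if y == positive_class else -1 for y in labels]


def positive_fraction(binary_labels):
    """Fraction of examples that carry the positive label."""
    return sum(1 for y in binary_labels if y == 1) / len(binary_labels)


# Hypothetical toy data: 4 balanced classes with 25 examples each.
labels = [c for c in ("a", "b", "c", "d") for _ in range(25)]

# One binary subproblem per class, as in a one-versus-all decomposition.
fractions = {
    c: positive_fraction(one_versus_all_labels(labels, c))
    for c in sorted(set(labels))
}

for c, f in fractions.items():
    print(f"class {c!r} vs rest: {f:.0%} positive, {1 - f:.0%} negative")
```

Even though the original four-class data is perfectly balanced, every binary subproblem here is split 25% to 75%, and the skew grows with the number of classes. A direct multiclass method avoids creating these subproblems in the first place.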


Similar resources

Multiclass Semi-supervised Boosting Using Different Distance Metrics

The goal of this thesis project is to build an effective multiclass classifier which can be trained with a small amount of labeled data and a large pool of unlabeled data by applying semi-supervised learning in a boosting framework. Boosting refers to a general method of producing a very accurate classifier by combining rough and moderately inaccurate classifiers. It has attracted a significant...


Semi-Supervised Boosting for Multi-Class Classification

Most semi-supervised learning algorithms have been designed for binary classification, and are extended to multi-class classification by approaches such as one-against-the-rest. The main shortcoming of these approaches is that they are unable to exploit the fact that each example is only assigned to one class. Additional problems with extending semisupervised binary classifiers to multi-class p...


ManifoldBoost: Stagewise Function Approximation for Fully-, Semi- and Un-supervised Learning

We introduce a boosting framework to solve a classification problem with added manifold and ambient regularization costs. It allows for a natural extension of boosting into both semisupervised problems and unsupervised problems. The augmented cost is minimized in a greedy, stagewise functional minimization procedure as in GradientBoost. Our method provides insights into generalization issues in...


Tracking-Based Semi-Supervised Learning

In this paper, we consider a semi-supervised approach to the problem of track classification in dense 3D range data. This problem involves the classification of objects that have been segmented and tracked without the use of a class-specific tracker. We propose a method based on the EM algorithm: iteratively 1) train a classifier, and 2) extract useful training examples from unlabeled data by e...


UvA-DARE (Digital Academic Repository): Boosting for Multiclass Semi-Supervised Learning

Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: http://uba.uva.nl/en/contact, or a letter to: Library of ...



Journal title:
  • Pattern Recognition Letters

Volume 37  Issue

Pages  -

Publication date 2014